Audio encoding/decoding based on an efficient representation of auto-regressive coefficients

ABSTRACT

An encoder for encoding a parametric spectral representation (ƒ) of auto-regressive coefficients that partially represent an audio signal. The encoder includes a low-frequency encoder configured to quantize elements of a part of the parametric spectral representation that correspond to a low-frequency part of the audio signal. It also includes a high-frequency encoder configured to encode a high-frequency part (ƒ^(H)) of the parametric spectral representation (ƒ) by weighted averaging based on the quantized elements ({circumflex over (ƒ)}^(L)) flipped around a quantized mirroring frequency ({circumflex over (ƒ)}_(m)), which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure. Also described are a corresponding decoder, corresponding encoding/decoding methods and UEs including such an encoder/decoder.

RELATED APPLICATIONS

The present application is a continuation of co-pending U.S. patent application Ser. No. 14/994,561, filed 13 Jan. 2016, which is a continuation of application Ser. No. 14/355,031, filed 29 Apr. 2014 and issued as U.S. Pat. No. 9,269,364 on 23 Feb. 2016, which application was a national stage entry under 35 U.S.C. § 371 of international patent application serial no. PCT/SE2012/050520, filed 15 May 2012, claiming priority to and the benefit of U.S. provisional patent application Ser. No. 61/554,647, filed 2 Nov. 2011. The entire contents of each of the aforementioned applications are incorporated herein by reference.

TECHNICAL FIELD

The technology disclosed herein relates to audio encoding/decoding based on an efficient representation of auto-regressive (AR) coefficients.

BACKGROUND

AR analysis is commonly used in both time-domain [1] and transform-domain audio coding [2]. Different applications use AR vectors of different lengths. The model order depends mainly on the bandwidth of the coded signal, ranging from 10 coefficients for signals with a bandwidth of 4 kHz to 24 coefficients for signals with a bandwidth of 16 kHz. These AR coefficients are quantized with split, multistage vector quantization (VQ), which guarantees nearly transparent reconstruction. However, conventional quantization schemes are not designed for the case when AR coefficients model high audio frequencies, for example above 6 kHz, and when the quantization is operated with very limited bit-budgets (which do not allow transparent coding of the coefficients). This introduces large perceptual errors in the reconstructed signal when these conventional quantization schemes are used at non-optimal frequency ranges and with non-optimal bitrates.

SUMMARY

An object of the disclosed technology is a more efficient quantization scheme for the auto-regressive coefficients. This objective may be achieved with several of the embodiments disclosed herein.

A first aspect of the technology described herein involves a method of encoding a parametric spectral representation of auto-regressive coefficients that partially represent an audio signal. An example method includes the following steps: encoding a low-frequency part of the parametric spectral representation by quantizing elements of the parametric spectral representation that correspond to a low-frequency part of the audio signal; and encoding a high-frequency part of the parametric spectral representation by weighted averaging based on the quantized elements flipped around a quantized mirroring frequency, which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure.

A second aspect of the technology described herein involves a method of decoding an encoded parametric spectral representation of auto-regressive coefficients that partially represent an audio signal. An example method includes the following steps: reconstructing elements of a low-frequency part of the parametric spectral representation corresponding to a low-frequency part of the audio signal from at least one quantization index encoding that part of the parametric spectral representation; and reconstructing elements of a high-frequency part of the parametric spectral representation by weighted averaging based on the decoded elements flipped around a decoded mirroring frequency, which separates the low-frequency part from the high-frequency part, and a decoded frequency grid.

A third aspect of the technology described herein involves an encoder for encoding a parametric spectral representation of auto-regressive coefficients that partially represent an audio signal. An example encoder includes: a low-frequency encoder configured to encode a low-frequency part of the parametric spectral representation by quantizing elements of the parametric spectral representation that correspond to a low-frequency part of the audio signal; and a high-frequency encoder configured to encode a high-frequency part of the parametric spectral representation by weighted averaging based on the quantized elements flipped around a quantized mirroring frequency, which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure. A fourth aspect of the technology described herein involves a UE including the encoder in accordance with the third aspect.

A fifth aspect involves a decoder for decoding an encoded parametric spectral representation of auto-regressive coefficients that partially represent an audio signal. An example decoder includes: a low-frequency decoder configured to reconstruct elements of a low-frequency part of the parametric spectral representation corresponding to a low-frequency part of the audio signal from at least one quantization index encoding that part of the parametric spectral representation; and a high-frequency decoder configured to reconstruct elements of a high-frequency part of the parametric spectral representation by weighted averaging based on the decoded elements flipped around a decoded mirroring frequency, which separates the low-frequency part from the high-frequency part, and a decoded frequency grid. A sixth aspect of the technology described herein involves a UE including the decoder in accordance with the fifth aspect.

The technology detailed below provides a low-bitrate scheme for compression or encoding of auto-regressive coefficients. In addition to perceptual improvements, the technology also has the advantage of reducing the computational complexity in comparison to full-spectrum-quantization methods.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosed technology, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:

FIG. 1 is a flow chart of the encoding method in accordance with the disclosed technology;

FIG. 2 illustrates an embodiment of the encoder side method of the disclosed technology;

FIG. 3 illustrates flipping of quantized low-frequency LSF elements (represented by black dots) to high frequency by mirroring them to the space previously occupied by the upper half of the LSF vector;

FIG. 4 illustrates the effect of grid smoothing on a signal spectrum;

FIG. 5 is a block diagram of an embodiment of the encoder in accordance with the disclosed technology;

FIG. 6 is a block diagram of an embodiment of the encoder in accordance with the disclosed technology;

FIG. 7 is a flow chart of the decoding method in accordance with the disclosed technology;

FIG. 8 illustrates an embodiment of the decoder side method of the disclosed technology;

FIG. 9 is a block diagram of an embodiment of the decoder in accordance with the disclosed technology;

FIG. 10 is a block diagram of an embodiment of the decoder in accordance with the disclosed technology;

FIG. 11 is a block diagram of an embodiment of the encoder in accordance with the disclosed technology;

FIG. 12 is a block diagram of an embodiment of the decoder in accordance with the disclosed technology;

FIG. 13 illustrates an embodiment of a user equipment including an encoder in accordance with the disclosed technology; and

FIG. 14 illustrates an embodiment of a user equipment including a decoder in accordance with the disclosed technology.

DETAILED DESCRIPTION

The disclosed technology requires as input a vector a of AR coefficients (another commonly used name is linear prediction (LP) coefficients). These are typically obtained by first computing the autocorrelations r(j) of the windowed audio segment s(n), n=1, . . . , N, i.e.:

$\begin{matrix}{{r(j)} = {\sum\limits_{n = j}^{N}{s(n)\, s(n - j)}},\quad j = 0,\ldots,M} & (1)\end{matrix}$

where M is a pre-defined model order. The AR coefficients a are then obtained from the autocorrelation sequence r(j) through the Levinson-Durbin algorithm [3].
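The two steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation referenced in [3]: the function names are hypothetical, the samples are indexed from zero, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_M z^-M is assumed, and no windowing or regularisation is included.

```python
def autocorrelation(s, M):
    """Equation (1) with 0-based sample indices: r(j) = sum_n s[n] * s[n - j]."""
    N = len(s)
    return [sum(s[n] * s[n - j] for n in range(j, N)) for j in range(M + 1)]


def levinson_durbin(r):
    """Return ([a_1..a_M], prediction_error) for A(z) = 1 + a_1 z^-1 + ... + a_M z^-M.

    Assumes r[0] > 0 (a non-silent segment); no regularisation is applied.
    """
    M = len(r) - 1
    a = [1.0] + [0.0] * M
    err = r[0]
    for m in range(1, M + 1):
        # Reflection coefficient for model order m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        prev = a[:]
        # Update the lower-order coefficients, then append the new one.
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a[1:], err


r = autocorrelation([1.0, 2.0, 3.0], 1)       # toy segment, M = 1
a, err = levinson_durbin([1.0, 0.5, 0.25])    # toy autocorrelation sequence, M = 2
```

For the toy autocorrelation sequence, which decays geometrically like that of a first-order process, the recursion yields a first coefficient of -0.5 and a second coefficient of zero.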

In an audio communication system AR coefficients have to be efficiently transmitted from the encoder to the decoder part of the system. In the disclosed technology this is achieved by quantizing only certain coefficients, and representing the remaining coefficients with only a small number of bits.

Encoder

FIG. 1 is a flow chart of the encoding method in accordance with the disclosed technology. Step S1 encodes a low-frequency part of the parametric spectral representation by quantizing elements of the parametric spectral representation that correspond to a low-frequency part of the audio signal. Step S2 encodes a high-frequency part of the parametric spectral representation by weighted averaging based on the quantized elements flipped around a quantized mirroring frequency, which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure.

FIG. 2 illustrates steps performed on the encoder side of an embodiment of the disclosed technology. First the AR coefficients are converted to a Line Spectral Frequencies (LSF) representation in step S3, e.g. by the algorithm described in [4]. Then the LSF vector ƒ is split into two parts, denoted as low-frequency (L) and high-frequency (H) parts, in step S4. For example, in a 10-dimensional LSF vector the first 5 coefficients may be assigned to the L subvector ƒ^(L) and the remaining coefficients to the H subvector ƒ^(H).

Although the disclosed technology will be described with reference to an LSF representation, the general concepts may also be applied to an alternative implementation in which the AR vector is converted to another parametric spectral representation, such as Line Spectral Pairs (LSP) or Immittance Spectral Pairs (ISP) instead of LSF.

Only the low-frequency LSF subvector ƒ^(L) is quantized in step S5, and its quantization indices I_(ƒL) are transmitted to the decoder. The high-frequency LSFs of the subvector ƒ^(H) are not quantized, but only used in the quantization of a mirroring frequency ƒ_(m) (to {circumflex over (ƒ)}_(m)), and the closed loop search for an optimal frequency grid g^(opt) from a set of frequency grids g^(i) forming a frequency grid codebook, as described with reference to equations (2)-(13) below. The quantization indices I_(m) and I_(g) for the mirroring frequency and optimal frequency grid, respectively, represent the coded high-frequency LSF vector ƒ^(H) and are transmitted to the decoder. The encoding of the high-frequency subvector ƒ^(H) will occasionally be referred to as “extrapolation” in the following description.

In the disclosed embodiment quantization is based on a set of scalar quantizers (SQs) individually optimized on the statistical properties of the above parameters. In an alternative implementation the LSF elements could be sent to a vector quantizer (VQ) or one can even train a VQ for the combined set of parameters (LSFs, mirroring frequency, and optimal grid).

The low-frequency LSFs of subvector ƒ^(L) are in step S6 flipped into the space spanned by the high-frequency LSFs of subvector ƒ^(H). This operation is illustrated in FIG. 3. First the quantized mirroring frequency {circumflex over (ƒ)}_(m) is calculated in accordance with:

$\begin{matrix}{{\hat{f}}_{m} = {Q\left( {f(M/2)} - {\hat{f}(M/2 - 1)} \right)} + {\hat{f}(M/2 - 1)}} & (2)\end{matrix}$

where ƒ denotes the entire LSF vector, and Q(⋅) is the quantization of the difference between the first element in ƒ^(H) (namely ƒ(M/2)) and the last quantized element in ƒ^(L) (namely {circumflex over (ƒ)}(M/2−1)), and where M denotes the total number of elements in the parametric spectral representation.

Next the flipped LSFs ƒ_(flip)(k) are calculated in accordance with:

$\begin{matrix}{{f_{flip}(k)} = {2{\hat{f}}_{m}} - {\hat{f}(M/2 - 1 - k)},\quad 0 \leq k \leq M/2 - 1} & (3)\end{matrix}$

Then the flipped LSFs are rescaled so that they are bounded within the range [0 . . . 0.5] (alternatively, the range can be represented in radians as [0 . . . π]) in accordance with:

$\begin{matrix}{{{\tilde{f}}_{flip}(k)} = \left\{ \begin{matrix}{{\left( {f_{flip}(k)} - {f_{flip}(0)} \right) \cdot \left( f_{\max} - {\hat{f}}_{m} \right)/{\hat{f}}_{m}} + {f_{flip}(0)},} & {{\hat{f}}_{m} > 0.25} \\{{f_{flip}(k)},} & {otherwise}\end{matrix} \right.} & (4)\end{matrix}$
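The flipping and rescaling operations can be illustrated with a short Python sketch. The function names and the toy input values are hypothetical, and the default f_max = 0.5 is an assumption made here (the upper end of the normalized range mentioned above); the text does not state its value explicitly.

```python
def flip_low_band(f_low_q, f_m_q):
    """Mirror the quantized low-band LSFs around f_m_q: 2*f_m_q - f_low_q[M/2-1-k]."""
    half = len(f_low_q)
    return [2.0 * f_m_q - f_low_q[half - 1 - k] for k in range(half)]


def rescale_flipped(f_flip, f_m_q, f_max=0.5):
    """Compress the flipped LSFs as in equation (4) when f_m_q exceeds 0.25.

    f_max = 0.5 is an assumed value, not taken from the text.
    """
    if f_m_q > 0.25:
        scale = (f_max - f_m_q) / f_m_q
        return [(v - f_flip[0]) * scale + f_flip[0] for v in f_flip]
    return list(f_flip)


# Hypothetical quantized low band (M/2 = 5) and mirroring frequency.
f_low_q = [0.05, 0.10, 0.15, 0.20, 0.25]
f_m_q = 0.30

flipped = flip_low_band(f_low_q, f_m_q)
rescaled = rescale_flipped(flipped, f_m_q)
```

With these toy values the flipped elements run from 0.35 up to 0.55, which exceeds the normalized range; after rescaling the largest element is about 0.483, so the whole vector again lies below 0.5.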

The frequency grids g^(i) are rescaled to fit into the interval between the last quantized LSF element {circumflex over (ƒ)}(M/2−1) and a maximum grid point value g_(max), i.e.:

$\begin{matrix}{{{\tilde{g}}^{i}(k)} = {{g^{i}(k)} \cdot \left( g_{\max} - {\hat{f}(M/2 - 1)} \right)} + {\hat{f}(M/2 - 1)}} & (5)\end{matrix}$

The flipped and rescaled coefficients {tilde over (ƒ)}_(flip)(k) (collectively denoted {tilde over (ƒ)}^(H) in FIG. 2) are further processed in step S7 by smoothing with the rescaled frequency grids {tilde over (g)}^(i)(k). Smoothing has the form of a weighted sum between flipped and rescaled LSFs {tilde over (ƒ)}_(flip)(k) and the rescaled frequency grids {tilde over (g)}^(i)(k), in accordance with:

$\begin{matrix}{{f_{smooth}(k)} = {\left\lbrack 1 - {\lambda(k)} \right\rbrack{{\tilde{f}}_{flip}(k)}} + {{\lambda(k)}{{\tilde{g}}^{i}(k)}}} & (6)\end{matrix}$

where λ(k) and [1−λ(k)] are predefined weights.

Since equation (6) includes a free index i, this means that a vector ƒ_(smooth)(k) will be generated for each {tilde over (g)}^(i)(k). Thus, equation (6) may be expressed as:

$\begin{matrix}{{f_{smooth}^{i}(k)} = {\left\lbrack 1 - {\lambda(k)} \right\rbrack{{\tilde{f}}_{flip}(k)}} + {{\lambda(k)}{{\tilde{g}}^{i}(k)}}} & (7)\end{matrix}$

The smoothing is performed in step S7 in a closed-loop search over all frequency grids g^(i), to find the one that minimizes a pre-defined criterion (described after equation (12) below).

For M/2=5 the weights λ(k) in equation (7) can be chosen as:

$\begin{matrix}{\lambda = \left\{ 0.2,\ 0.35,\ 0.5,\ 0.75,\ 0.8 \right\}} & (8)\end{matrix}$

In an embodiment these constants are perceptually optimized (different sets of values are suggested, and the set that maximizes quality, as reported by a panel of listeners, is finally selected). Generally the values of elements in λ increase as the index k increases. Since a higher index corresponds to a higher frequency, the higher frequencies of the resulting spectrum are more influenced by {tilde over (g)}^(i)(k) than by {tilde over (ƒ)}_(flip) (see equation (7)). The result of this smoothing or weighted averaging is a flatter spectrum towards the high frequencies (the spectrum structure potentially introduced by {tilde over (ƒ)}_(flip) is progressively removed towards high frequencies).

Here g_(max) is selected close to but less than 0.5. In this example g_(max) is selected equal to 0.49.

The method in this example uses 4 trained grids g^(i) (fewer or more grids are possible). Template grid vectors on a range [0 . . . 1], pre-stored in memory, are of the form:

$\begin{matrix}\left\{ \begin{matrix}{g^{1} = \{ 0.17274857,\ 0.35811835,\ 0.52369229,\ 0.71552804,\ 0.85539771 \}} \\{g^{2} = \{ 0.16313042,\ 0.30782962,\ 0.43109281,\ 0.59395830,\ 0.81291897 \}} \\{g^{3} = \{ 0.17172427,\ 0.33157177,\ 0.48528862,\ 0.66492442,\ 0.82952486 \}} \\{g^{4} = \{ 0.16666667,\ 0.33333333,\ 0.50000000,\ 0.66666667,\ 0.83333333 \}}\end{matrix} \right. & (9)\end{matrix}$

If we assume that the position of the last quantized LSF coefficient {circumflex over (ƒ)}(M/2−1) is 0.25, the rescaled grid vectors take the form:

$\begin{matrix}\left\{ \begin{matrix}{{\tilde{g}}^{1} = \{ 0.2915,\ 0.3359,\ 0.3757,\ 0.4217,\ 0.4553 \}} \\{{\tilde{g}}^{2} = \{ 0.2892,\ 0.3239,\ 0.3535,\ 0.3925,\ 0.4451 \}} \\{{\tilde{g}}^{3} = \{ 0.2912,\ 0.3296,\ 0.3665,\ 0.4096,\ 0.4491 \}} \\{{\tilde{g}}^{4} = \{ 0.2900,\ 0.3300,\ 0.3700,\ 0.4100,\ 0.4500 \}}\end{matrix} \right. & (10)\end{matrix}$
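The rescaled values above can be reproduced with a small Python sketch. It maps each template grid point g into the interval between the last quantized LSF and g_max by computing g · (g_max − last) + last, the same form that the decoder uses in equation (14). The names `TEMPLATE_GRIDS` and `rescale_grid` are hypothetical; the numbers are those of equation (9).

```python
# Template grids from equation (9), defined on the range [0 .. 1].
TEMPLATE_GRIDS = [
    [0.17274857, 0.35811835, 0.52369229, 0.71552804, 0.85539771],
    [0.16313042, 0.30782962, 0.43109281, 0.59395830, 0.81291897],
    [0.17172427, 0.33157177, 0.48528862, 0.66492442, 0.82952486],
    [0.16666667, 0.33333333, 0.50000000, 0.66666667, 0.83333333],
]


def rescale_grid(grid, f_last_q, g_max=0.49):
    """Map a template grid into the interval [f_last_q, g_max]."""
    return [g * (g_max - f_last_q) + f_last_q for g in grid]


# Last quantized low-band LSF assumed at 0.25, as in the text.
rescaled_grids = [rescale_grid(g, 0.25) for g in TEMPLATE_GRIDS]
rounded = [[round(v, 4) for v in grid] for grid in rescaled_grids]
```

Because the last quantized LSF changes from frame to frame, a routine like this would be evaluated once per frame, while the template table itself stays fixed.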

An example of the effect of smoothing the flipped and rescaled LSF coefficients to the grid points is illustrated in FIG. 4. With increasing number of grid vectors used in the closed loop procedure, the resulting spectrum gets closer and closer to the target spectrum.

If g_(max)=0.5 instead of 0.49, the frequency grid codebook may instead be formed by:

$\begin{matrix}\left\{ \begin{matrix}{g^{1} = \{ 0.15998503,\ 0.31215086,\ 0.47349756,\ 0.66540429,\ 0.84043882 \}} \\{g^{2} = \{ 0.15614473,\ 0.30697672,\ 0.45619822,\ 0.62493785,\ 0.77798001 \}} \\{g^{3} = \{ 0.14185823,\ 0.26648724,\ 0.39740108,\ 0.55685745,\ 0.74688616 \}} \\{g^{4} = \{ 0.15416561,\ 0.27238427,\ 0.39376780,\ 0.59287916,\ 0.86613986 \}}\end{matrix} \right. & (11)\end{matrix}$

If we again assume that the position of the last quantized LSF coefficient {circumflex over (ƒ)}(M/2−1) is 0.25, the rescaled grid vectors take the form:

$\begin{matrix}\left\{ \begin{matrix}{{\tilde{g}}^{1} = \{ 0.28999626,\ 0.32803772,\ 0.36837439,\ 0.41635107,\ 0.46010970 \}} \\{{\tilde{g}}^{2} = \{ 0.28903618,\ 0.32674418,\ 0.36404956,\ 0.40623446,\ 0.44449500 \}} \\{{\tilde{g}}^{3} = \{ 0.28546456,\ 0.31662181,\ 0.34935027,\ 0.38921436,\ 0.43672154 \}} \\{{\tilde{g}}^{4} = \{ 0.28854140,\ 0.31809607,\ 0.34844195,\ 0.39821979,\ 0.46653496 \}}\end{matrix} \right. & (12)\end{matrix}$

It is noted that the rescaled grids {tilde over (g)}^(i) may be different from frame to frame, since {circumflex over (ƒ)}(M/2−1) in rescaling equation (5) may not be constant but vary with time. However, the codebook formed by the template grids g^(i) is constant. In this sense the rescaled grids {tilde over (g)}^(i) may be considered as an adaptive codebook formed from a fixed codebook of template grids g^(i).

The LSF vectors ƒ^(i)_(smooth) created by the weighted sum in (7) are compared to the target LSF vector ƒ^(H), and the optimal grid g^(opt) is selected as the one that minimizes the mean-squared error (MSE) between these two vectors. The index opt of this optimal grid may mathematically be expressed as:

$\begin{matrix}{{opt} = {\underset{i}{\arg\min}\left( {\sum\limits_{k = 0}^{{M/2} - 1}\left( {{f_{smooth}^{i}(k)} - {f^{H}(k)}} \right)^{2}} \right)}} & (13)\end{matrix}$

where ƒ^(H)(k) is a target vector formed by the elements of the high-frequency part of the parametric spectral representation.
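The closed-loop search can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names are hypothetical, grid indices are 0-based, the weights are those of equation (8), the candidate grids are the rescaled values of equation (10), and the flipped input vector is an invented toy value.

```python
LAMBDA = [0.2, 0.35, 0.5, 0.75, 0.8]           # weights of equation (8)

RESCALED_GRIDS = [                              # rescaled grids of equation (10)
    [0.2915, 0.3359, 0.3757, 0.4217, 0.4553],
    [0.2892, 0.3239, 0.3535, 0.3925, 0.4451],
    [0.2912, 0.3296, 0.3665, 0.4096, 0.4491],
    [0.2900, 0.3300, 0.3700, 0.4100, 0.4500],
]


def smooth(f_flip_rescaled, grid_rescaled, lam=LAMBDA):
    """Weighted average of flipped LSFs and one rescaled grid (equation (7))."""
    return [(1.0 - l) * f + l * g
            for f, g, l in zip(f_flip_rescaled, grid_rescaled, lam)]


def search_grid(f_flip_rescaled, rescaled_grids, f_high, lam=LAMBDA):
    """Return (index, error) of the grid minimizing the squared error of (13)."""
    best_index, best_error = -1, float("inf")
    for i, grid in enumerate(rescaled_grids):
        candidate = smooth(f_flip_rescaled, grid, lam)
        error = sum((c - t) ** 2 for c, t in zip(candidate, f_high))
        if error < best_error:
            best_index, best_error = i, error
    return best_index, best_error


# Toy check: build a target from grid index 2 and confirm the search finds it.
f_flip_rescaled = [0.35, 0.38, 0.42, 0.45, 0.48]   # hypothetical flipped LSFs
f_high = smooth(f_flip_rescaled, RESCALED_GRIDS[2])
best_index, best_error = search_grid(f_flip_rescaled, RESCALED_GRIDS, f_high)
```

Only the winning index needs to be transmitted; with four grids this costs two bits per frame.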

In an alternative implementation one can use more advanced error measures that mimic spectral distortion (SD), e.g., inverse harmonic mean or other weighting on the LSF domain.

In an embodiment the frequency grid codebook is obtained with a K-means clustering algorithm on a large set of LSF vectors, which has been extracted from a speech database. The grid vectors in equations (9) and (11) are selected as the ones that, after rescaling in accordance with equation (5) and weighted averaging with {tilde over (ƒ)}_(flip) in accordance with equation (7), minimize the squared distance to ƒ^(H). In other words these grid vectors, when used in equation (7), give the best representation of the high-frequency LSF coefficients.

FIG. 5 is a block diagram of an embodiment of the encoder in accordance with the disclosed technology. The encoder 40 includes a low-frequency encoder 10 configured to encode a low-frequency part of the parametric spectral representation ƒ by quantizing elements of the parametric spectral representation that correspond to a low-frequency part of the audio signal. The encoder 40 also includes a high-frequency encoder 12 configured to encode a high-frequency part ƒ^(H) of the parametric spectral representation by weighted averaging based on the quantized elements {circumflex over (ƒ)}^(L) flipped around a quantized mirroring frequency separating the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook 24 in a closed-loop search procedure. The quantized entities {circumflex over (ƒ)}^(L), {circumflex over (ƒ)}_(m), g^(opt) are represented by the corresponding quantization indices I_(ƒL), I_(m), I_(g), which are transmitted to the decoder.

FIG. 6 is a block diagram of an embodiment of the encoder in accordance with the disclosed technology. The low-frequency encoder 10 receives the entire LSF vector ƒ, which is split into a low-frequency part or subvector ƒ^(L) and a high-frequency part or subvector ƒ^(H) by a vector splitter 14. The low-frequency part is forwarded to a quantizer 16, which is configured to encode the low-frequency part ƒ^(L) by quantizing its elements, either by scalar or vector quantization, into a quantized low-frequency part or subvector {circumflex over (ƒ)}^(L). At least one quantization index I_(ƒL) (depending on the quantization method used) is outputted for transmission to the decoder.

The quantized low-frequency subvector {circumflex over (ƒ)}^(L) and the not yet encoded high-frequency subvector ƒ^(H) are forwarded to the high-frequency encoder 12. A mirroring frequency calculator 18 is configured to calculate the quantized mirroring frequency {circumflex over (ƒ)}_(m) in accordance with equation (2). The dashed lines indicate that only the last quantized element {circumflex over (ƒ)}(M/2−1) in {circumflex over (ƒ)}^(L) and the first element ƒ(M/2) in ƒ^(H) are required for this. The quantization index I_(m) representing the quantized mirroring frequency {circumflex over (ƒ)}_(m) is outputted for transmission to the decoder.

The quantized mirroring frequency {circumflex over (ƒ)}_(m) is forwarded to a quantized low-frequency subvector flipping unit 20 configured to flip the elements of the quantized low-frequency subvector {circumflex over (ƒ)}^(L) around the quantized mirroring frequency {circumflex over (ƒ)}_(m) in accordance with equation (3). The flipped elements ƒ_(flip)(k) and the quantized mirroring frequency {circumflex over (ƒ)}_(m) are forwarded to a flipped element rescaler 22 configured to rescale the flipped elements in accordance with equation (4).

The frequency grids g^(i)(k) are forwarded from frequency grid codebook 24 to a frequency grid rescaler 26, which also receives the last quantized element {circumflex over (ƒ)}(M/2−1) in {circumflex over (ƒ)}^(L). The rescaler 26 is configured to perform rescaling in accordance with equation (5).

The flipped and rescaled LSFs {tilde over (ƒ)}_(flip)(k) from flipped element rescaler 22 and the rescaled frequency grids {tilde over (g)}^(i)(k) from frequency grid rescaler 26 are forwarded to a weighting unit 28, which is configured to perform a weighted averaging in accordance with equation (7). The resulting smoothed elements ƒ_(smooth)^(i)(k) and the high-frequency target vector ƒ^(H) are forwarded to a frequency grid search unit 30 configured to select a frequency grid g^(opt) in accordance with equation (13). The corresponding index I_(g) is transmitted to the decoder.

Decoder

FIG. 7 is a flow chart of the decoding method in accordance with the disclosed technology. Step S11 reconstructs elements of a low-frequency part of the parametric spectral representation corresponding to a low-frequency part of the audio signal from at least one quantization index encoding that part of the parametric spectral representation. Step S12 reconstructs elements of a high-frequency part of the parametric spectral representation by weighted averaging based on the decoded elements flipped around a decoded mirroring frequency, which separates the low-frequency part from the high-frequency part, and a decoded frequency grid.

The method steps performed at the decoder are illustrated by the embodiment in FIG. 8. First the quantization indices I_(ƒL), I_(m), I_(g) for the low-frequency LSFs, optimal mirroring frequency and optimal grid, respectively, are received.

In step S13 the quantized low-frequency part {circumflex over (ƒ)}^(L) is reconstructed from a low-frequency codebook by using the received index I_(ƒL).

The method steps performed at the decoder for reconstructing the high-frequency part {circumflex over (ƒ)}^(H) are very similar to the already described encoder processing steps in equations (3)-(7).

The flipping and rescaling steps performed at the decoder (at S14) are identical to the encoder operations, and are therefore described exactly by equations (3)-(4).

The steps (at S15) of rescaling the grid (equation (5)) and smoothing with it (equation (6)) require only slight modification in the decoder, because the closed-loop search (the search over i) is not performed: the decoder receives the optimal index opt from the bit stream. These equations instead take the following form:

$\begin{matrix}{{{\tilde{g}}^{opt}(k)} = {{g^{opt}(k)} \cdot \left( g_{\max} - {\hat{f}(M/2 - 1)} \right)} + {\hat{f}(M/2 - 1)}} & (14)\end{matrix}$

and

$\begin{matrix}{{f_{smooth}(k)} = {\left\lbrack 1 - {\lambda(k)} \right\rbrack{{\tilde{f}}_{flip}(k)}} + {{\lambda(k)}{{\tilde{g}}^{opt}(k)}}} & (15)\end{matrix}$

respectively. The vector ƒ_(smooth) represents the high-frequency part {circumflex over (ƒ)}^(H) of the decoded signal.
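The decoder-side reconstruction of the high-frequency part can be sketched as a single Python function that chains the flipping, rescaling, grid rescaling and smoothing steps. The function name and the toy inputs are hypothetical, the flip is taken as a mirror image of the decoded low band around the decoded mirroring frequency, and f_max = 0.5 is an assumption (its value is not stated in the text).

```python
def decode_high_band(f_low_q, f_m_q, grid, lam, g_max=0.49, f_max=0.5):
    """Rebuild the high-band LSFs from decoded parameters.

    f_low_q : decoded low-band LSFs (ascending)
    f_m_q   : decoded mirroring frequency
    grid    : template grid looked up with the received index
    lam     : smoothing weights, one per high-band element
    f_max   : assumed upper bound of the normalized frequency range
    """
    half = len(f_low_q)
    # Flip the low band around the mirroring frequency (equation (3)).
    f_flip = [2.0 * f_m_q - f_low_q[half - 1 - k] for k in range(half)]
    # Compress the flipped values when the mirror lies above 0.25 (equation (4)).
    if f_m_q > 0.25:
        scale = (f_max - f_m_q) / f_m_q
        f_flip = [(v - f_flip[0]) * scale + f_flip[0] for v in f_flip]
    # Rescale the selected grid into [last low-band LSF, g_max] (equation (14)).
    f_last = f_low_q[-1]
    grid_rs = [g * (g_max - f_last) + f_last for g in grid]
    # Weighted average of flipped LSFs and grid (equation (15)).
    return [(1.0 - l) * f + l * g for f, g, l in zip(f_flip, grid_rs, lam)]


# Toy inputs: hypothetical decoded low band, mirror at 0.30, template grid g^4.
f_low_q = [0.05, 0.10, 0.15, 0.20, 0.25]
g4 = [0.16666667, 0.33333333, 0.50000000, 0.66666667, 0.83333333]
lam = [0.2, 0.35, 0.5, 0.75, 0.8]
f_high_q = decode_high_band(f_low_q, 0.30, g4, lam)
```

For these toy inputs the reconstructed high band is strictly increasing and stays above the last low-band LSF, so concatenating it with the low band gives an ordered LSF vector.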

Finally the low- and high-frequency parts {circumflex over (ƒ)}^(L), {circumflex over (ƒ)}^(H) of the LSF vector are combined in step S16, and the resulting vector {circumflex over (ƒ)} is transformed to AR coefficients â in step S17.

FIG. 9 is a block diagram of an embodiment of the decoder 50 in accordance with the disclosed technology. A low-frequency decoder 60 is configured to reconstruct elements {circumflex over (ƒ)}^(L) of a low-frequency part ƒ^(L) of the parametric spectral representation ƒ corresponding to a low-frequency part of the audio signal from at least one quantization index I_(ƒL) encoding that part of the parametric spectral representation. A high-frequency decoder 62 is configured to reconstruct elements {circumflex over (ƒ)}^(H) of a high-frequency part ƒ^(H) of the parametric spectral representation by weighted averaging based on the decoded elements {circumflex over (ƒ)}^(L) flipped around a decoded mirroring frequency {circumflex over (ƒ)}_(m), which separates the low-frequency part from the high-frequency part, and a decoded frequency grid g^(opt). The frequency grid g^(opt) is obtained by retrieving the frequency grid that corresponds to a received index I_(g) from a frequency grid codebook 24 (this is the same codebook as in the encoder).

FIG. 10 is a block diagram of an embodiment of the decoder in accordance with the disclosed technology. The low-frequency decoder 60 receives at least one quantization index I_(ƒL), depending on whether scalar or vector quantization is used, and forwards it to a quantization index decoder 64, which reconstructs elements {circumflex over (ƒ)}^(L) of the low-frequency part of the parametric spectral representation. The high-frequency decoder 62 receives a mirroring frequency quantization index I_(m), which is forwarded to a mirroring frequency decoder 66 for decoding the mirroring frequency {circumflex over (ƒ)}_(m). The remaining blocks 20, 22, 24, 26 and 28 perform the same functions as the correspondingly numbered blocks in the encoder illustrated in FIG. 6. The essential differences between the encoder and the decoder are that the mirroring frequency is decoded from the index I_(m) instead of being calculated from equation (2), and that the frequency grid search unit 30 in the encoder is not required, since the optimal frequency grid is obtained directly from frequency grid codebook 24 by looking up the frequency grid g^(opt) that corresponds to the received index I_(g).

The steps, functions, procedures and/or blocks described herein may be implemented in hardware using any conventional technology, such as discrete circuit or integrated circuit technology, including both general-purpose electronic circuitry and application-specific circuitry.

Alternatively, at least some of the steps, functions, procedures and/or blocks described herein may be implemented in software for execution by suitable processing equipment. This equipment may include, for example, one or several microprocessors, one or several Digital Signal Processors (DSP), one or several Application Specific Integrated Circuits (ASIC), video accelerated hardware or one or several suitable programmable logic devices, such as Field Programmable Gate Arrays (FPGA). Combinations of such processing elements are also feasible.

It should also be understood that it may be possible to reuse the general processing capabilities already present in a UE. This may, for example, be done by reprogramming of the existing software or by adding new software components.

FIG. 11 is a block diagram of an embodiment of the encoder 40 in accordance with the disclosed technology. This embodiment is based on a processor 110, for example a microprocessor, which executes software 120 for quantizing the low-frequency part ƒ^(L) of the parametric spectral representation, and software 130 for search of an optimal extrapolation represented by the mirroring frequency {circumflex over (ƒ)}_(m) and the optimal frequency grid vector g^(opt). The software is stored in memory 140. The processor 110 communicates with the memory over a system bus. The incoming parametric spectral representation ƒ is received by an input/output (I/O) controller 150 controlling an I/O bus, to which the processor 110 and the memory 140 are connected. The software 120 may implement the functionality of the low-frequency encoder 10. The software 130 may implement the functionality of the high-frequency encoder 12. The quantized parameters {circumflex over (ƒ)}^(L), {circumflex over (ƒ)}_(m), g^(opt) (or preferably the corresponding indices I_(ƒL), I_(m), I_(g)) obtained from the software 120 and 130 are outputted from the memory 140 by the I/O controller 150 over the I/O bus.

FIG. 12 is a block diagram of an embodiment of the decoder 50 in accordance with the disclosed technology. This embodiment is based on a processor 210, for example a microprocessor, which executes software 220 for decoding the low-frequency part ƒ^(L) of the parametric spectral representation, and software 230 for decoding the high-frequency part ƒ^(H) of the parametric spectral representation by extrapolation. The software is stored in memory 240. The processor 210 communicates with the memory over a system bus. The incoming encoded parameters {circumflex over (ƒ)}^(L), {circumflex over (ƒ)}_(m), g^(opt) (represented by I_(ƒL), I_(m), I_(g)) are received by an input/output (I/O) controller 250 controlling an I/O bus, to which the processor 210 and the memory 240 are connected. The software 220 may implement the functionality of the low-frequency decoder 60. The software 230 may implement the functionality of the high-frequency decoder 62. The decoded parametric representation {circumflex over (ƒ)} ({circumflex over (ƒ)}^(L) combined with {circumflex over (ƒ)}^(H)) obtained from the software 220 and 230 is outputted from the memory 240 by the I/O controller 250 over the I/O bus.

FIG. 13 illustrates an embodiment of a user equipment UE including anencoder in accordance with the disclosed technology. A microphone 70forwards an audio signal to an A/D converter 72. The digitized audiosignal is encoded by an audio encoder 74. Only the components relevantfor illustrating the disclosed technology are illustrated in the audioencoder 74. The audio encoder 74 includes an AR coefficient estimator76, an AR to parametric spectral representation converter 78 and anencoder 40 of the parametric spectral representation. The encodedparametric spectral representation (together with other encoded audioparameters that are not needed to illustrate the present technology) isforwarded to a radio unit 80 for channel encoding and up-conversion toradio frequency and transmission to a decoder over an antenna.

FIG. 14 illustrates an embodiment of a user equipment UE including a decoder in accordance with the disclosed technology. An antenna receives a signal including the encoded parametric spectral representation and forwards it to radio unit 82 for down-conversion from radio frequency and channel decoding. The resulting digital signal is forwarded to an audio decoder 84. Only the components relevant for illustrating the disclosed technology are illustrated in the audio decoder 84. The audio decoder 84 includes a decoder 50 of the parametric spectral representation and a parametric spectral representation to AR converter 86. The AR coefficients are used (together with other decoded audio parameters that are not needed to illustrate the present technology) to decode the audio signal, and the resulting audio samples are forwarded to a D/A conversion and amplification unit 88, which outputs the audio signal to a loudspeaker 90.

In one example application the disclosed AR quantization-extrapolation scheme is used in a BWE context. In this case AR analysis is performed on a certain high frequency band, and AR coefficients are used only for the synthesis filter. Instead of being obtained with the corresponding analysis filter, the excitation signal for this high band is extrapolated from an independently coded low band excitation.

In another example application the disclosed AR quantization-extrapolation scheme is used in an ACELP type coding scheme. ACELP coders model a speaker's vocal tract with an AR model. An excitation signal e(n) is generated by passing a waveform s(n) through a whitening filter e(n)=A(z)s(n), where A(z)=1+a₁z⁻¹+ . . . +a_(M)z^(−M) is the AR model of order M. On a frame-by-frame basis a set of AR coefficients a=[a₁ a₂ . . . a_(M)]^(T) and the excitation signal are quantized, and quantization indices are transmitted over the network. At the decoder, synthesized speech is generated on a frame-by-frame basis by sending the reconstructed excitation signal through the reconstructed synthesis filter A(z)⁻¹.
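The relation between the whitening filter A(z) and the synthesis filter A(z)⁻¹ can be illustrated with a minimal sketch. The function names, the zero initial filter state, and the example coefficient values below are assumptions chosen for illustration only; no quantization is applied, so the round trip is exact up to floating-point error.

```python
def whiten(s, a):
    """Analysis (whitening) filter A(z): e(n) = s(n) + sum_k a[k-1] * s(n-k)."""
    e = []
    for n in range(len(s)):
        acc = s[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:  # assumption: zero filter state before the frame
                acc += a[k - 1] * s[n - k]
        e.append(acc)
    return e


def synthesize(e, a):
    """Synthesis filter 1/A(z): s(n) = e(n) - sum_k a[k-1] * s(n-k)."""
    s = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * s[n - k]
        s.append(acc)
    return s


# Hypothetical AR model of order M = 2 and a short waveform.
a = [-0.9, 0.2]
s = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
e = whiten(s, a)          # excitation signal
s_rec = synthesize(e, a)  # reconstruction from the excitation
```

In an actual coder both the AR coefficients and the excitation would be quantized before synthesis, so the reconstruction would only approximate the input.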

In a further example application the disclosed AR quantization-extrapolation scheme is used as an efficient way to parameterize a spectrum envelope of a transform audio codec. On a short-time basis the waveform is transformed to the frequency domain, and the frequency response of the AR coefficients is used to approximate the spectrum envelope and normalize the transformed vector (to create a residual vector). Next the AR coefficients and the residual vector are coded and transmitted to the decoder.
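The envelope normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the envelope is taken as 1/|A(e^{jω})| sampled at bin centers, the function names are hypothetical, and the transform itself as well as the coding of the residual are omitted.

```python
import cmath
import math


def ar_envelope(a, num_bins):
    """Sample 1/|A(e^{jw})| at num_bins bin-center frequencies in (0, pi)."""
    env = []
    for b in range(num_bins):
        w = math.pi * (b + 0.5) / num_bins  # assumption: bin-center sampling
        A = 1.0 + sum(a_k * cmath.exp(-1j * w * (k + 1)) for k, a_k in enumerate(a))
        env.append(1.0 / abs(A))
    return env


def normalize(X, env):
    """Divide transform coefficients by the envelope to form the residual vector."""
    return [x / g for x, g in zip(X, env)]


def denormalize(R, env):
    """Decoder side: re-apply the envelope to the decoded residual vector."""
    return [r * g for r, g in zip(R, env)]


# Hypothetical first-order AR model (low-pass shaped) and transform vector.
a = [-0.9]
X = [4.0, -3.0, 2.0, -1.5, 1.0, -0.5, 0.25, -0.1]
env = ar_envelope(a, len(X))
R = normalize(X, env)
X_rec = denormalize(R, env)
```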

It will be understood by those skilled in the art that various modifications and changes may be made to the disclosed technology without departure from the scope thereof, which is defined by the appended claims.

ABBREVIATIONS

-   ACELP Algebraic Code Excited Linear Prediction
-   ASIC Application Specific Integrated Circuit
-   AR Auto Regression
-   BWE Bandwidth Extension
-   DSP Digital Signal Processor
-   FPGA Field Programmable Gate Array
-   ISP Immittance Spectral Pairs
-   LP Linear Prediction
-   LSF Line Spectral Frequencies
-   LSP Line Spectral Pair
-   MSE Mean Squared Error
-   SD Spectral Distortion
-   SQ Scalar Quantizer
-   UE User Equipment
-   VQ Vector Quantization

REFERENCES

-   [1] 3GPP TS 26.090, “Adaptive Multi-Rate (AMR) speech codec; Transcoding functions”, p. 13, 2007
-   [2] N. Iwakami, et al., “High-quality audio-coding at less than 64 kbit/s by using transform-domain weighted interleave vector quantization (TWINVQ)”, IEEE ICASSP, vol. 5, pp. 3095-3098, 1995
-   [3] J. Makhoul, “Linear prediction: A tutorial review”, Proc. IEEE, vol. 63, p. 566, 1975
-   [4] P. Kabal and R. P. Ramachandran, “The computation of line spectral frequencies using Chebyshev polynomials”, IEEE Trans. on ASSP, vol. 34, no. 6, pp. 1419-1426, 1986

What is claimed is:
 1. A method, comprising: encoding an audio signal, wherein encoding the audio signal comprises obtaining a parametric spectral representation (ƒ) of auto-regressive coefficients (a) that partially represent the audio signal, encoding a low-frequency part (ƒ^(L)) of the parametric spectral representation (ƒ) by quantizing coefficients of the parametric spectral representation that correspond to a low-frequency part of the audio signal, and encoding a high-frequency part (ƒ^(H)) of the parametric spectral representation (ƒ) by weighted averaging based on the quantized coefficients ({circumflex over (ƒ)}^(L)) flipped around a quantized mirroring frequency ({circumflex over (ƒ)}_(m)), which separates the low-frequency part from the high-frequency part, and a frequency grid codebook obtained in a closed-loop search procedure; and outputting, for transmission to a decoder, at least one quantization index (I_(ƒL)) representing the quantized coefficients ({circumflex over (ƒ)}^(L)), a quantization index (I_(m)) representing the quantized mirroring frequency ({circumflex over (ƒ)}_(m)) and a quantization index (I_(g)) representing a frequency grid (g^(opt)).
 2. The method of claim 1, further comprising transmitting encoded audio to a decoder, the encoded audio comprising the at least one quantization index (I_(ƒL)), the quantization index (I_(m)), and the quantization index (I_(g)).
 3. The method of claim 1, wherein encoding the audio signal further comprises quantizing the mirroring frequency {circumflex over (ƒ)}_(m) in accordance with: {circumflex over (ƒ)}_(m)=Q(ƒ(M/2)−{circumflex over (ƒ)}(M/2−1))+{circumflex over (ƒ)}(M/2−1), where Q denotes quantization of the expression in the adjacent parenthesis, M denotes the total number of coefficients in the parametric spectral representation, ƒ(M/2) denotes the first coefficient in the high-frequency part, and {circumflex over (ƒ)}(M/2−1) denotes the last quantized coefficient in the low-frequency part.
 4. The method of claim 3, wherein encoding the audio signal further comprises flipping the quantized coefficients of the low frequency part (ƒ^(L)) of the parametric spectral representation (ƒ) around the quantized mirroring frequency {circumflex over (ƒ)}_(m) in accordance with: ƒ_(flip)(k)=2{circumflex over (ƒ)}_(m)−{circumflex over (ƒ)}(M/2−1−k), 0≤k≤M/2−1, where {circumflex over (ƒ)}(M/2−1−k) denotes quantized coefficient M/2−1−k.
 5. The method of claim 4, wherein encoding the audio signal further comprises rescaling the flipped coefficients ƒ_(flip)(k) in accordance with: $\tilde{f}_{flip}(k) = \begin{cases} \left( f_{flip}(k) - f_{flip}(0) \right) \cdot \left( f_{\max} - \hat{f}_{m} \right) / \hat{f}_{m} + f_{flip}(0), & \hat{f}_{m} > 0.25 \\ f_{flip}(k), & \text{otherwise} \end{cases}$.
 6. The method of claim 5, wherein encoding the audio signal further comprises rescaling the frequency grids g^(i) from the frequency grid codebook to fit into the interval between the last quantized coefficient {circumflex over (ƒ)}(M/2−1) in the low-frequency part and a maximum grid point value g_(max) in accordance with: {tilde over (g)}^(i)(k)=g^(i)(k)·(g_(max)−{circumflex over (ƒ)}(M/2−1))+{circumflex over (ƒ)}(M/2−1).
 7. The method of claim 6, wherein encoding the audio signal further comprises weighted averaging of the flipped and rescaled coefficients {tilde over (ƒ)}_(flip)(k) and the rescaled frequency grids {tilde over (g)}^(i)(k) in accordance with: ƒ_(smooth)^(i)(k)=[1−λ(k)]{tilde over (ƒ)}_(flip)(k)+λ(k){tilde over (g)}^(i)(k), where λ(k) and [1−λ(k)] are predefined weights.
 8. The method of claim 7, wherein encoding the audio signal further comprises selecting a frequency grid g^(opt), where the index opt satisfies the criterion: $opt = \underset{i}{\arg\min} \left( \sum_{k=0}^{M/2-1} \left( f_{smooth}^{i}(k) - f^{H}(k) \right)^{2} \right)$, where ƒ^(H)(k) is a target vector formed by the coefficients of the high-frequency part of the parametric spectral representation.
 9. The method of claim 8, wherein M=10, g_(max)=0.5, and the weights λ(k) are defined as λ={0.2, 0.35, 0.5, 0.75, 0.8}.
 10. The method of claim 1, wherein the encoding of the parametric spectral representation (ƒ) of auto-regressive coefficients is performed on a line spectral frequencies representation of the auto-regressive coefficients.
 11. An encoding apparatus, comprising: an audio encoding circuit configured to: encode an audio signal by obtaining a parametric spectral representation (ƒ) of auto-regressive coefficients (a) that partially represent the audio signal, encoding a low-frequency part (ƒ^(L)) of the parametric spectral representation (ƒ) by quantizing coefficients of the parametric spectral representation that correspond to a low-frequency part of the audio signal, and encoding a high-frequency part (ƒ^(H)) of the parametric spectral representation (ƒ) by weighted averaging based on the quantized coefficients ({circumflex over (ƒ)}^(L)) flipped around a quantized mirroring frequency ({circumflex over (ƒ)}_(m)), which separates the low-frequency part from the high-frequency part, and a frequency grid codebook obtained in a closed-loop search procedure; and output, for transmission to a decoder, at least one quantization index (I_(ƒL)) representing the quantized coefficients ({circumflex over (ƒ)}^(L)), a quantization index (I_(m)) representing the quantized mirroring frequency ({circumflex over (ƒ)}_(m)), and a quantization index (I_(g)) representing a frequency grid (g^(opt)).
 12. The encoding apparatus of claim 11, further comprising output circuitry configured to transmit encoded audio to a decoder, the encoded audio comprising the at least one quantization index (I_(ƒL)), the quantization index (I_(m)), and the quantization index (I_(g)).
 13. The encoding apparatus of claim 11, wherein the audio encoding circuit is further configured to quantize the mirroring frequency {circumflex over (ƒ)}_(m) in accordance with: {circumflex over (ƒ)}_(m)=Q(ƒ(M/2)−{circumflex over (ƒ)}(M/2−1))+{circumflex over (ƒ)}(M/2−1), where Q denotes quantization of the expression in the adjacent parenthesis, M denotes the total number of coefficients in the parametric spectral representation, ƒ(M/2) denotes the first coefficient in the high-frequency part, and {circumflex over (ƒ)}(M/2−1) denotes the last quantized coefficient in the low-frequency part.
 14. The encoding apparatus of claim 13, wherein the audio encoding circuit is further configured to flip the quantized coefficients of the low frequency part (ƒ^(L)) of the parametric spectral representation (ƒ) around the quantized mirroring frequency {circumflex over (ƒ)}_(m), in accordance with: ƒ_(flip)(k)=2{circumflex over (ƒ)}_(m)−{circumflex over (ƒ)}(M/2−1−k), 0≤k≤M/2−1, where {circumflex over (ƒ)}(M/2−1−k) denotes the quantized coefficient M/2−1−k.
 15. The encoding apparatus of claim 14, wherein the audio encoding circuit is further configured to rescale the flipped coefficients ƒ_(flip)(k) in accordance with: $\tilde{f}_{flip}(k) = \begin{cases} \left( f_{flip}(k) - f_{flip}(0) \right) \cdot \left( f_{\max} - \hat{f}_{m} \right) / \hat{f}_{m} + f_{flip}(0), & \hat{f}_{m} > 0.25 \\ f_{flip}(k), & \text{otherwise} \end{cases}$.
 16. The encoding apparatus of claim 15, wherein the audio encoding circuit is further configured to rescale the frequency grids g^(i) from the frequency grid codebook to fit into the interval between the last quantized coefficient {circumflex over (ƒ)}(M/2−1) in the low-frequency part and a maximum grid point value g_(max) in accordance with: {tilde over (g)}^(i)(k)=g^(i)(k)·(g_(max)−{circumflex over (ƒ)}(M/2−1))+{circumflex over (ƒ)}(M/2−1).
 17. The encoding apparatus of claim 16, wherein the audio encoding circuit is further configured to perform weighted averaging of the flipped and rescaled coefficients {tilde over (ƒ)}_(flip)(k) and the rescaled frequency grids {tilde over (g)}^(i)(k) in accordance with: ƒ_(smooth)^(i)(k)=[1−λ(k)]{tilde over (ƒ)}_(flip)(k)+λ(k){tilde over (g)}^(i)(k), where λ(k) and [1−λ(k)] are predefined weights.
 18. The encoding apparatus of claim 17, wherein the audio encoding circuit is further configured to select a frequency grid g^(opt), where the index opt satisfies the criterion: $opt = \underset{i}{\arg\min} \left( \sum_{k=0}^{M/2-1} \left( f_{smooth}^{i}(k) - f^{H}(k) \right)^{2} \right)$, where ƒ^(H)(k) is a target vector formed by the coefficients of the high-frequency part of the parametric spectral representation.
 19. The encoding apparatus of claim 18, wherein M=10, g_(max)=0.5, and the weights λ(k) are defined as λ={0.2, 0.35, 0.5, 0.75, 0.8}.
 20. The encoding apparatus of claim 11, wherein the audio encoding circuit is configured to perform encoding of the parametric spectral representation (ƒ) of auto-regressive coefficients on a line spectral frequencies representation of the auto-regressive coefficients.
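
For illustration only, the processing steps recited in claims 3 to 9 (and mirrored in claims 13 to 19) can be sketched as below. Several details are assumptions that the claims do not specify: the uniform scalar quantizer standing in for Q with a hypothetical step size, the value f_max = 0.5, the contents of the example grid codebook, the example vectors, and all function and variable names. The sketch is not a definitive implementation of the claimed encoder.

```python
M = 10                                # number of coefficients (claim 9)
G_MAX = 0.5                           # maximum grid point value (claim 9)
F_MAX = 0.5                           # assumption: f_max is not given in the claims
LAMBDA = [0.2, 0.35, 0.5, 0.75, 0.8]  # predefined weights (claim 9)


def quantize_scalar(x, step=0.01):
    """Hypothetical uniform scalar quantizer standing in for Q(.)."""
    return round(x / step) * step


def encode_high_part(f, f_hat_low, grid_codebook):
    """Return (quantized mirroring frequency, grid index opt, smoothed vector)."""
    half = M // 2
    last_low = f_hat_low[half - 1]
    # Claim 3: quantize the mirroring frequency relative to the last low coefficient.
    f_m = quantize_scalar(f[half] - last_low) + last_low
    # Claim 4: flip the quantized low-frequency coefficients around f_m.
    f_flip = [2.0 * f_m - f_hat_low[half - 1 - k] for k in range(half)]
    # Claim 5: rescale the flipped coefficients when f_m exceeds 0.25.
    if f_m > 0.25:
        scale = (F_MAX - f_m) / f_m
        f_flip = [(v - f_flip[0]) * scale + f_flip[0] for v in f_flip]
    target = f[half:]
    best_i, best_err, best_vec = None, float("inf"), None
    for i, g in enumerate(grid_codebook):
        # Claim 6: rescale grid i into the interval [last_low, G_MAX].
        g_res = [gk * (G_MAX - last_low) + last_low for gk in g]
        # Claim 7: weighted average of flipped coefficients and rescaled grid.
        smooth = [(1.0 - lam) * ff + lam * gg
                  for lam, ff, gg in zip(LAMBDA, f_flip, g_res)]
        # Claim 8: closed-loop search minimizing the squared error to the target.
        err = sum((s - t) ** 2 for s, t in zip(smooth, target))
        if err < best_err:
            best_i, best_err, best_vec = i, err, smooth
    return f_m, best_i, best_vec


# Hypothetical normalized LSF vector and two-entry grid codebook.
f = [0.03, 0.07, 0.11, 0.16, 0.21, 0.27, 0.32, 0.37, 0.42, 0.46]
f_hat_low = f[:5]  # assumption: low part quantized without error
codebook = [[0.2, 0.4, 0.6, 0.8, 0.95], [0.1, 0.3, 0.5, 0.7, 0.9]]
f_m_hat, opt, f_high_hat = encode_high_part(f, f_hat_low, codebook)
```

Only the indices for the low-frequency coefficients, the mirroring frequency and the selected grid would be output for transmission; a decoder holding the same codebook and weights could repeat the flipping, rescaling and averaging with the received grid index.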